*Generates FIGURE 1
*Version 15 Stata

set more off

*Establish working directories
local d3 Geographic Data
local d5 Antidepressant Data
local d22 Entropy Data

*Bring in data
*NOTE: These data are created by the do-file entropy_data_final.do. They are in
*	product, patient, provider, and year level, with several definitions of 
*	drug product.
cd `d22'
use entropy_data_final_we.dta, clear

*Define product
local drug product3

*Parse variables
keep year tot_scripts provider_id product3

*Define global mental health drug products across years
local drug_products=33

*Generate yearly script variable by doctor
by provider_id year, sort: egen annual_scripts = sum(tot_scripts)

*Define "other" drug category
generate other = cond(product3!="SERTRALINE HCL" & product3!="CITALOPRAM HBR" & product3!="FLUOXETINE HCL" & product3!="ESCITALOPRAM OXAL" & product3!="Paroxetine" & product3!="Venlafaxine" & product3!="DULOXETINE HCL" & product3!="Bupropion" & product3!="TRAZODONE HCL" & product3!="AMITRIPTYLINE HCL" & product3!="MIRTAZAPINE", 1, 0)

*Generate yearly script variable by doctor GIVEN APA GUIDELINE
generate tot_scripts_apa = tot_scripts
replace tot_scripts_apa = 0 if product3=="TRAZODONE HCL" | product3=="AMITRIPTYLINE HCL" | other==1

*Calculate entropy for physician vector, as is

preserve

*Collapse to provider-year-county level
collapse (sum) tot_scripts, by(year `drug' provider_id)

*Generate total yearly prescriptions by county
sort provider_id year
by provider_id year: egen total = sum(tot_scripts)

*Generate entropy measures
generate share = tot_scripts/total
generate temp2 = -(share*ln(share))/ln(`drug_products')
sort provider_id year
by provider_id year: egen entropy_all_global=sum(temp2)

*Keep one record per provider/year
sort provider_id year
bysort provider_id year: keep if _n==1

*Parse variables
keep provider_id year entropy_all_global total
rename entropy_all_global entropy_orig

*Smooth total variable
replace total = round(total)
rename total total_overall

*Save resulting file
tempfile original
save `original', replace

restore

*Calculate entropy for physician vector, APA

preserve

*Collapse to provider-year-county level
collapse (sum) tot_scripts_apa, by(year `drug' provider_id)

*Generate total yearly prescriptions by county
sort provider_id year
by provider_id year: egen total = sum(tot_scripts_apa)

*Generate entropy measures
generate share = tot_scripts/total
generate temp2 = -(share*ln(share))/ln(`drug_products')
sort provider_id year
by provider_id year: egen entropy_all_global=sum(temp2)

*Keep one record per provider/year
sort provider_id year
bysort provider_id year: keep if _n==1

*Parse variables
keep provider_id year entropy_all_global total
rename entropy_all_global entropy_apa

*Smooth total variable
replace total = round(total)
rename total total_apa

*Save resulting file
tempfile apa
save `apa', replace

restore

*Merge tempfiles together
use `original', clear
merge 1:1 provider_id year using `apa'
keep if _merge==3
drop _merge

*Generate indicators for valid observations
generate valid_apa = total_apa>0

*Make graph
*Define colors
local color1 "186 228 188"
local color2 "123 204 196"
local color3 "67 162 202"
local color4 "8 104 172"

local year = 2013

*apa
twoway (histogram entropy_orig if year==`year' & valid_apa==1 [fw=total_overall], fraction start(0) width(0.01) color("`color3'")) (histogram entropy_apa if year==`year' & valid_apa==1  [fw=total_overall], fraction start(0) width(0.01) fcolor(none) lcolor("`color4'")), legend(order(1 "Original" 2 "APA" )) xtitle("Physician entropy") note("Data from `year'." "Product definition: Unique molecules (n=32)" "Includes all physicians in US.", size(tiny))